24 research outputs found

    Computing the probability for data loss in two-dimensional parity RAIDs

    Get PDF
    Parity RAIDs are used to protect storage systems against disk failures. The idea is to add redundancy to the system by storing the parity of subsets of disks on extra parity disks. A simple two-dimensional scheme is the one in which the data disks are arranged in a rectangular grid, and every row and column is extended by one disk which stores the parity of it. In this paper we describe several two-dimensional parity RAIDs and analyse, for each of them, the probability for data loss given that f random disks fail. This probability can be used to determine the overall probability using the model of Hafner and Rao. We reduce subsets of the forest counting problem to the different cases and show that the generalised problem is #Phard. Further we adapt an exact algorithm by Stones for some of the problems whose worst-case runtime is exponential, but which is very efficient for small fixed f and thus sufficient for all real-world applications

    Computing the Probability for Data Loss in Two-Dimensional Parity RAIDs

    Get PDF
    Parity RAIDs are used to protect storage systems against disk failures. The idea is to add redundancy to the system by storing the parity of subsets of disks on extra parity disks. A simple two-dimensional scheme is the one in which the data disks are arranged in a rectangular grid, and every row and column is extended by one disk which stores the parity of it. In this paper we describe several two-dimensional parity RAIDs and analyse, for each of them, the probability for data loss given that f random disks fail. This probability can be used to determine the overall probability using the model of Hafner and Rao. We reduce subsets of the forest counting problem to the different cases and show that the generalised problem is #Phard. Further we adapt an exact algorithm by Stones for some of the problems whose worst-case runtime is exponential, but which is very efficient for small fixed f and thus sufficient for all real-world applications

    Evaluation of SLA-based decision strategies for VM scheduling in cloud data centers

    Get PDF
    Copyright © 2016 held by owner/author(s). Service level agreements (SLAs) gain more and more importance in the area of cloud computing. An SLA is a contract between a customer and a cloud service provider (CSP) in which the CSP guarantees functional and non-functional quality of service parameters for cloud services. Since CSPs have to pay for the hardware used as well as penalties for violating SLAs, they are eager to fulfill these agreements while at the same time optimizing the utilization of their resources. In this paper we examine SLA-aware VM scheduling strategies for cloud data centers. The service level objectives considered are resource usage and availability. The sample resources are CPU and RAM. They can be overprovisioned by the CSPs which is the main leverage to increase their revenue. The availability of a VM is affected by migrating it within and between data centers. To get realistic results, we simulate the effect of the strategies using the FederatedCloudSim framework and real-world workload traces of business-critical VMs. Our evaluation shows that there are considerable differences between the scheduling strategies in terms of SLA violations and the number of migrations. From all strategies considered, the combination of the Minimization of Migrations strategy for VM selection and the Worst Fit strategy for host selection achieves the best results

    And now for something completely different: running Lisp on GPUs

    Get PDF
    The internal parallelism of compute resources increases permanently, and graphics processing units (GPUs) and other accelerators have been gaining importance in many domains. Researchers from life science, bioinformatics or artificial intelligence, for example, use GPUs to accelerate their computations. However, languages typically used in some of these disciplines often do not benefit from the technical developments because they cannot be executed natively on GPUs. Instead existing programs must be rewritten in other, less dynamic programming languages. On the other hand, the gap in programming features between accelerators and common CPUs shrinks permanently. Since accelerators are becoming more competitive with regard to general computations, they will not be mere special-purpose processors in the future. It is a valid assumption that future GPU generations can be used in a similar or even the same way as CPUs and that compilers or interpreters will be needed for a wider range of computer languages. We present CuLi, an interactive Lisp interpreter, that performs all computations on a CUDA-capable GPU. The host system is needed only for the input and the output. At the moment, Lisp programs running on CPUs outperform Lisp programs on GPUs, but we present trends indicating that this might change in the future. Our study gives an outlook on the possibility of running Lisp programs or other dynamic programming languages on next-generation accelerators

    Hyperion: Building the largest in-memory search tree

    Get PDF
    Indexes are essential in data management systems to increase the speed of data retrievals. Widespread data structures to provide fast and memory-efficient indexes are prefix tries. Implementations like Judy, ART, or HOT optimize their internal alignments for cache and vector unit efficiency. While these measures usually improve the performance substantially, they can have a negative impact on memory efficiency. In this paper we present Hyperion, a trie-based main-memory key-value store achieving extreme space efficiency. In contrast to other data structures, Hyperion does not depend on CPU vector units, but scans the data structure linearly. Combined with a custom memory allocator, Hyperion accomplishes a remarkable data density while achieving a competitive point query and an exceptional range query performance. Hyperion can significantly reduce the index memory footprint, while being at least two times better concerning the performance to memory ratio compared to the best implemented alternative strategies for randomized string data sets

    Financial evaluation of SLA-based VM scheduling strategies for cloud federations

    Get PDF
    In recent years, cloud federations have gained popularity. Small as well as big cloud service providers (CSPs) join federations to reduce their costs, and also cloud management software like OpenStack offers support for federations. In a federation, individual CSPs cooperate such that they can move load to partner clouds at high peaks and possibly offer a wider range of services to their customers. Research in this area addresses the organization of such federations and strategies that CSPs can apply to increase their profit. In this paper we present the latest extensions to the FederatedCloudSim framework that considerably improve the simulation and evaluation of cloud federations. These simulations include service-level agreements (SLAs), scheduling and brokering strategies on various levels, the use of real-world cloud workload traces and a fine-grained financial evaluation using the new CloudAccount module. We use FederatedCloudSim to compare scheduling and brokering strategies on the federation level. Among them are new strategies that conduct auctions or consult a reliance factor to select an appropriate federated partner for running outsourced virtual machines. Our results show that choosing the right strategy has a significant impact on SLA compliance and revenue

    Smart Grid-aware scheduling in data centres

    Get PDF
    © 2016 In several countries the expansion and establishment of renewable energies result in widely scattered and often weather-dependent energy production, decoupled from energy demand. Large, fossil-fuelled power plants are gradually replaced by many small power stations that transform wind, solar and water power into electrical power. This leads to changes in the historically evolved power grid that favours top-down energy distribution from a backbone of large power plants to widespread consumers. Now, with the increase of energy production in lower layers of the grid, there is also a bottom-up flow of the grid infrastructure compromising its stability. In order to locally adapt the energy demand to the production, some countries have started to establish Smart Grids to incentivise customers to consume energy when it is generated. This paper investigates how data centres can benefit from variable energy prices in Smart Grids. In view of their low average utilisation, data centre providers can schedule the workload dependent on the energy price. We consider a scenario for a data centre in Paderborn, Germany, hosting a large share of interruptible and migratable computing jobs. We suggest and compare two scheduling strategies for minimising energy costs. The first one merely uses current values from the Smart Meter to place the jobs, while the other one also estimates the future energy price in the grid based on weather forecasts. In spite of the complexity of the prediction problem and the inaccuracy of the weather data, both strategies perform well and have a strong positive effect on the utilisation of renewable energy and on the reduction of energy costs. This work improves and extends the paper of the same title published on the SustainIT conference (Mäsker et al., 2015). While that paper puts more emphasis on the utilisation of green energy, the new algorithms find a better balance between energy costs and turnaround time. We slightly alter the scenario using a more realistic multi-queue batch system and improve the scheduling algorithms which can be tuned to prioritise turnaround time or green energy utilisation

    Pure functions in C: A small keyword for automatic parallelization

    Get PDF
    © 2020, The Author(s). The need for parallel task execution has been steadily growing in recent years since manufacturers mainly improve processor performance by increasing the number of installed cores instead of scaling the processor’s frequency. To make use of this potential, an essential technique to increase the parallelism of a program is to parallelize loops. Several automatic loop nest parallelizers have been developed in the past such as PluTo. The main restriction of these tools is that the loops must be statically analyzable which, among other things, disallows function calls within the loops. In this article, we present a seemingly simple extension to the C programming language which marks functions without side-effects. These functions can then basically be ignored when the automatic parallelizer checks the parallelizability of loops. We integrated the approach into the GCC compiler toolchain and evaluated it by running several real-world applications. Our experiments show that the C extension helps to identify additional parallelization opportunities and, thus, to significantly increase the performance of applications

    Deduplication potential of HPC applications' checkpoints

    Get PDF
    © 2016 IEEE. HPC systems contain an increasing number of components, decreasing the mean time between failures. Checkpoint mechanisms help to overcome such failures for long-running applications. A viable solution to remove the resulting pressure from the I/O backends is to deduplicate the checkpoints. However, there is little knowledge about the potential to save I/Os for HPC applications by using deduplication within the checkpointing process. In this paper, we perform a broad study about the deduplication behavior of HPC application checkpointing and its impact on system design

    Pure functions in C: A small keyword for automatic parallelization

    Get PDF
    © 2017 IEEE. The need for parallel task execution has been steadily growing in recent years since manufacturers mainly improve processor performance by scaling the number of installed cores instead of the frequency of processors. To make use of this potential, an essential technique to increase the parallelism of a program is to parallelize loops. However, a main restriction of available tools for automatic loop parallelization is that the loops often have to be 'polyhedral' and that it is, e.g., not allowed to call functions from within the loops.In this paper, we present a seemingly simple extension to the C programming language which marks functions without side-effects. These functions can then basically be ignored when checking the parallelization opportunities for polyhedral loops. We extended the GCC compiler toolchain accordingly and evaluated several real-world applications showing that our extension helps to identify additional parallelization chances and, thus, to significantly enhance the performance of applications
    corecore